Knowledge Acquisition from Texts: Using an Automatic Clustering Method Based on Noun-Modifier Relationship
Abstract
We describe the early stage of our methodology of knowledge acquisition from technical texts. First, a partial morpho-syntactic analysis is performed to extract "candidate terms". Then, the knowledge engineer, assisted by an automatic clustering tool, builds the "conceptual fields" of the domain. We focus on this conceptual analysis stage, describe the data prepared from the results of the morpho-syntactic analysis and show the results of the clustering module and their interpretation. We found that syntactic links represent good descriptors for candidate terms clustering since the clusters are often easily interpreted as "conceptual fields".
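To make the idea of clustering candidate terms by their syntactic links concrete, here is a minimal sketch. It is not the authors' clustering module: the abstract does not name the similarity measure or the algorithm, so the Jaccard overlap of modifier sets, the single-link grouping, the threshold value, and the sample (modifier, head) pairs are all assumptions chosen for illustration.

```python
from itertools import combinations

# Hypothetical (modifier, head) pairs standing in for extracted candidate terms.
PAIRS = [
    ("electric", "pump"), ("hydraulic", "pump"), ("auxiliary", "pump"),
    ("electric", "motor"), ("hydraulic", "motor"), ("auxiliary", "motor"),
    ("safety", "rule"), ("operating", "rule"),
    ("safety", "procedure"), ("operating", "procedure"),
]


def modifier_sets(pairs):
    """Map each head noun to the set of modifiers it occurs with."""
    sets = {}
    for modifier, head in pairs:
        sets.setdefault(head, set()).add(modifier)
    return sets


def jaccard(a, b):
    """Jaccard overlap of two sets (assumed similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_heads(pairs, threshold=0.5):
    """Group head nouns whose modifier sets overlap at or above threshold.

    Uses single-link grouping via union-find; this is an assumption,
    not the method described in the abstract.
    """
    sets = modifier_sets(pairs)
    parent = {head: head for head in sets}

    def find(x):
        # Follow parent links to the cluster representative.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(sorted(sets), 2):
        if jaccard(sets[a], sets[b]) >= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for head in sets:
        clusters.setdefault(find(head), set()).add(head)
    return sorted(sorted(members) for members in clusters.values())


if __name__ == "__main__":
    print(cluster_heads(PAIRS))
```

On this toy data, heads that share modifiers ("pump" and "motor", "rule" and "procedure") end up in the same group, which is the kind of cluster a knowledge engineer might then interpret as a conceptual field.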
Similar resources
Semi-Automatic Recognition of Noun Modifier Relationships
Semantic relationships among words and phrases are often marked by explicit syntactic or lexical clues that help recognize such relationships in texts. Within complex nominals, however, few overt clues are available. Systems that analyze such nominals must compensate for the lack of surface clues with other information. One way is to load the system with lexical semantics for nouns or adjective...
Using Word Similarity Lists For Resolving Indirect Anaphora
In this work we test the use of word similarity lists for anaphora resolution in Portuguese corpora. We applied an automatic lexical acquisition technique over parsed texts to identify semantically similar words. After that, we made use of this lexical knowledge to resolve coreferent definite descriptions where the head-noun of the anaphor is different from the head-noun of its antecedent, whic...
A Comparative Study of Nominalization in an English Applied Linguistics Textbook and its Persian Translation
Among the linguistic resources for creating grammatical metaphor, nominalization rewords processes and properties metaphorically as nouns within the experiential metafunction of language. Following Halliday's (1998a) classification of grammatical metaphor, the current study investigated nominalization exploited in an English applied linguistics textbook and its corresponding Persian translati...
Automatic Interpretation of Noun Compounds Using WordNet Similarity
The paper introduces a method for interpreting novel noun compounds with semantic relations. The method is built around word similarity with pretagged noun compounds, based on WordNet::Similarity. Over 1,088 training instances and 1,081 test instances from the Wall Street Journal in the Penn Treebank, the proposed method was able to correctly classify 53.3% of the test noun compounds. We also i...
An Endogeneous Corpus-Based Method for Structural Noun Phrase Disambiguation
In this paper, we describe a method for structural noun phrase disambiguation which mainly relies on the examination of the text corpus under analysis and doesn't need to integrate any domain-dependent lexico- or syntactico-semantic information. This method is implemented in the Terminology Extraction Software LEXTER. We first explain why the integration of LEXTER in the LEXTER-K project, which ai...